Kobe University and Muroran Institute of Technology at TRECVID 2012 Semantic Indexing Task
Abstract
This paper describes our method developed for the TRECVID 2012 Semantic Indexing (SIN) task. Our main research goal is a fast method that can run on a single processor without performance degradation. To this end, computationally expensive processes are re-formulated as matrix operations. We re-formulate the Euclidean distance computation used for kernel values in an SVM, and the probability density computation of multivariate normal distributions used for the GMM supervector representation. This enables accurate concept detection using a large number of training examples and spatially-temporally dense features. The following four runs were submitted to the SIN (light) task:
• L A kobe muro l5 4: This is our baseline run using five features: (1) SIFT at Harris-Affine regions, (2) SIFT at Hessian-Affine regions, (3) trajectory displacement, (4) HOG around trajectories, and (5) MFCC. Five SVMs built on these features are fused using a weighted linear combination. This run achieved a MAP of 0.320.
• L A kobe muro l6 1: In addition to the above five features, this run uses a sixth feature, Spatio-Temporal Dense RGB SIFT (STD-RGB-SIFT), consisting of SIFT descriptors sampled at every sixth pixel in every other frame. Extracting this feature becomes feasible because of the significant speedup of the probability density computation. This run achieved a MAP of 0.348.
• L A kobe muro l18 3: To cover the diversity of a concept's appearances, this run uses bagging, where three SVMs are built on each of the six features using different subsets of training examples. Building this many SVMs is feasible because of the fast kernel value computation. The SVMs are fused using a weighted linear combination. This run achieved a MAP of 0.358, which is the highest score among all the runs submitted to the SIN (light) task.
• L A kobe muro r18 2: This run uses rough set theory to fuse the SVMs in L A kobe muro l18 3, and achieved a MAP of 0.323.
The above results indicate the effectiveness of the spatially-temporally dense feature STD-RGB-SIFT. In particular, a MAP of 0.302 was achieved using STD-RGB-SIFT alone. This is significantly higher than the MAPs obtained with any of the other single features. The results also show the effectiveness of bagging.
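The matrix re-formulation mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the RBF kernel form, and the diagonal-covariance assumption for the Gaussian components are all illustrative choices that the abstract does not specify. The idea is to expand the squared difference so that the dominant cost becomes a single matrix product instead of a loop over all pairs of examples.

```python
import numpy as np


def pairwise_sq_dists(X, Y):
    """Squared Euclidean distances between rows of X (n, d) and Y (m, d).

    Uses ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, so the main cost
    is the single matrix product X @ Y.T.
    """
    x_sq = np.sum(X * X, axis=1)[:, None]  # shape (n, 1)
    y_sq = np.sum(Y * Y, axis=1)[None, :]  # shape (1, m)
    d2 = x_sq + y_sq - 2.0 * (X @ Y.T)
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off


def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix (assumed kernel form; hypothetical helper)."""
    return np.exp(-gamma * pairwise_sq_dists(X, Y))


def diag_gaussian_log_density(X, means, variances):
    """log N(x_i | mu_k, diag(var_k)) for all i, k; returns shape (n, K).

    Assumes diagonal covariances (an assumption of this sketch).
    The quadratic term sum_j (x_j - mu_j)^2 / var_j is expanded into
    two matrix products plus a per-component constant.
    """
    d = X.shape[1]
    inv_var = 1.0 / variances  # shape (K, d)
    quad = (
        (X * X) @ inv_var.T
        - 2.0 * (X @ (means * inv_var).T)
        + np.sum(means * means * inv_var, axis=1)[None, :]
    )
    log_norm = -0.5 * (d * np.log(2.0 * np.pi) + np.sum(np.log(variances), axis=1))
    return log_norm[None, :] - 0.5 * quad
```

Both functions return the same values as a naive double loop over examples, but they hand the heavy arithmetic to optimized matrix multiplication, which is where a speedup of this kind would come from.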
Similar resources
University of Siegen, Kobe University and Muroran Institute of Technology at TRECVID 2013 Multimedia Event Detection
This paper presents our method developed for the TRECVID 2013 Multimedia Event Detection task. Two problems are mainly addressed. The first is the weakly supervised setting, where training videos contain many shots irrelevant to a target event. The other is the diversity of visual appearances, meaning that shots relevant to the event are characterised by significantly different visual appe...
Fudan University at TRECVID 2010: Semantic Indexing
In this notebook paper we describe our participation in the NIST TRECVID 2010 evaluation. We took part in the semantic indexing task of the benchmark this year. For semantic indexing, we submitted 3 automatic runs using only IACC training data: Fudan.TV10.3: this run is based on visual features of keyframes. Fudan.TV10.2: this run is based on visual features of keyframes and object detection. Fudan.TV1...
JOANNEUM RESEARCH and Vienna University of Technology at TRECVID 2010
We participated in two tasks: semantic indexing (SIN) and instance search (INS).
Kobe University at TRECVID 2011 Semantic Indexing and Multimedia Event Detection
This paper describes our methods and experimental results for the TRECVID 2011 SIN and MED tasks. For the SIN task, we submitted the run L A cs24 kobe sin 1, which addresses the following two problems: The first is the high computational cost of constructing an SVM with a large number of examples. To ensure the detection accuracy and speed for each concept, we developed a method that selects a sma...